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The worst situation in computing the minimal nonnegative solution of a 
nonsymmetric algebraic Riccati equation associated with an M-matrix occurs 
when the corresponding linearizing matrix has two very small eigenvalues, 
one with positive and one with negative real part. When both these eigen- 
values are exactly zero, the problem is called critical or null recurrent. While 
in this case the problem is ill-conditioned and the convergence of the algo- 
rithms based on matrix iterations is slow, there exist some techniques to 
remove the singularity and transform the problem to a well-behaved one. Ill- 
conditioning and slow convergence appear also in close-to-critical problems, 
but when none of the eigenvalues is exactly zero the techniques used for the 
critical case cannot be applied. 

In this paper, we introduce a new method to accelerate the convergence 
properties of the iterations also in close-to-critical cases, by working on the 
invariant subspace associated with the problematic eigenvalues as a whole. 
We present a theoretical analysis and several numerical experiments which 
confirm the efficiency of the new method. 

1 Introduction 

We consider the nonsymmetric algebraic Riccati equation (or NARE) 

XCX - AX - XD + B = 0, (1) 

where X,B (£ C mxn , A G C mxm , C G C nxm , D G C nxn . We write equation © briefly 
as TZ{X) = where U{X) = XCX - AX - XD + B. 
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In certain applications in queueing models [20] and in the numerical solution of trans- 
port equations [18] , the coefficients of ([1]) are such that 



M 



D 

-B 



-C 
A 



is an M-matrix, either nonsingular or singular irreducible. In this case, we give Equation 
(PQ) the acronym M-NARE. We recall that M £ C nxn is an M-matrix if it can be written 
in the form M = sl n — N, where I n is the identity matrix of size n (denoted also by I if 
there is no ambiguity), N is a matrix whose elements are nonnegative, for which we use 
the notation N ^ 0, and s ^ p(N), where p(-) is the spectral radius of a square matrix. 
The M-matrix M is singular if s = p(N) and nonsingular if s > p(N). It can be proved 
that the eigenvalues of an M-matrix have nonnegative real part [I] . 

The solutions of the NARE ([!]) can be put in correspondence with certain n-dimensional 
invariant subspaces of the matrix 



H 



D 
B 



-C 

-A 



(2) 



More precisely, a matrix X £ C mxn is a solution of ([T]) if and only if the columns of 



X 



span an invariant subspace of H. In particular, it holds that 



H 



'In 




'In 


X 




X 



(D-CX), 



(3) 



and the eigenvalues of D — CX are a subset of the eigenvalues of %. 

We say that the NARE (TTJ) is associated with the matrix H of ([2j) or that % is the 
linearizing matrix of the NARE. Observe that any 2x2 block matrix with square diagonal 
blocks yields a NARE associated with it. 

In the case of an M-NARE it can be proved that the eigenvalues of T-L can be ordered 
by non increasing real part such that 



»Ai ^ • • • ^ KA n _i > A n > > A n+ i > • • • > KA 



m+n i 



(4) 



that is, n eigenvalues belong to the closed right half complex plane and the other eigen- 
values to the closed left half plane, and the eigenvalues A n and A n+ i are real. If moreover 
M. is irreducible, then 5RA n _i > A n ^ ^ A n+ i > KA n+ 2- If A ra = = A n+ i, then the 
eigenvalue zero is associated to a size-2 Jordan block (see [1] and the references therein). 

All these spectral properties implies that the matrix 7~L associated with an M-NARE 
has a unique n-dimensional invariant subspace corresponding to the n rightmost eigen- 
values, namely Ai, . . . , A n , which we call the n-dimensional antistable invariant subspace 
of H (the term comes from the theory of the symmetric algebraic Riccati equations in 
dynamical systems [19]). On the other hand, the matrix 7i has a unique m-dimensional 
invariant subspace corresponding to the m leftmost eigenvalues, namely A n+ i, . . . , A n+m , 
which we call stable invariant subspace. 
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In the applications, the required solution of the M-NARE is the one for which the 

columns of " span the n-dimensional antistable invariant subspace of Ti, or, equiv- 
Ji 

alently, such that the eigenvalues of D — CX are the n rightmost eigenvalues of T~L. 
This solution has been proved to exist and it turns out to be the minimal element-wise 
nonnegative solution of (see [TO]). 

Equation ([TJ) is usually solved either by some matrix iteration, e.g., the Cyclic Reduc- 
tion (CR) [2 J or the Structured Doubling Algorithm (SDA) 0Q3], whose limits yield 
the required solution or using the ordered Schur form of H [UJ . 

Both the conditioning of the equation and the convergence speed of most iterations are 
strictly related to some property of good separation between the stable and anti-stable 
subspace; for this purpose, several different measures of "nearness" are used in literature; 
in Section [2] we introduce and discuss them. Nevertheless, all approaches identify two 
important cases: 

1. A n = A n +i = 0; in this case, the minimal nonnegative solution of (JJ) is ill- 
conditioned [12] and the convergence of most iterations degrades from quadratic 
to linear [7J. This case is known as critical case. Most of this problems can be 
circumvented by using the so-called shift technique [131 E] • It consists in making a 
special rank-one correction of H, obtaining a new Riccati equation with the same 
minimal solution. The new equation has better conditioning and the convergence 
of iterations is quadratic again. The shift technique works not only in the critical 
case, but also when only one of X n and A n _|_i is zero, yielding minor benefits in this 
case. 

2. equations that are (in some sense) "close" to having A n = A n +i = 0, while neither 
of the two eigenvalues is exactly zero. This case is known as close to critical. It 
is effectively the worst-case scenario, since the same difficulties as in the critical 
case appear (ill-conditioning, slow convergence of the numerical methods based 
on matrix iterations), but the shift technique cannot be applied as it requires an 
eigenvalue to be exactly zero. 

The difficulties in the latter case are the main motivation for this work. We present a new 
technique, which we call subspace shift, that aims to extend the shift technique to this 
case. A necessary assumption is that we can identify a small subset of the eigenvalues 
which are "responsible" for the ill-conditioning of the equation and well separated from 
the rest of the spectrum; we define this property more precisely in the following. We 
call the associated subspace central subspace. The dimension k of this subspace may be 
known a priori from the theoretical properties of the problem (as for instance in |18j), 
or determined at run-time, which is a more difficult task. 

The technique consists in building a rank-/c modification of the matrix that alters 
the eigenvalues associated with the central subspace, but does not modify the invariant 
subspaces and the minimal solution. In this way, we reduce the problem to a "far from 
critical" one, for which the solution can be computed with a faster and stabler iterative 
method. 
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The paper is organized as follows. In Section [21 we describe and compare the different 
notions of distance from criticality which exist in literature and how they affect the 
conditioning of the problem and the convergence speed of the numerical algorithms. 
In Section [3] we introduce the subspace shift technique and outline some results that 
give an insight of its behavior in terms of the different criticality metrics. In Section [5] 
we describe its implementation in more detail and discuss the computational aspects. 
Section [5] contains some experimental results that show the effectiveness of this technique. 
Finally, Section [U] draws the conclusions. 

In the following, o~(M) stands for the set of the eigenvalues of M £ C nxn , and |j-||^ 
denotes the Frobenius norm. We define the Cayley transform of parameter 7 € M \ {0} 
as the map 

r ■ ^ z ~ 7 

7 z + 7 

Notice that, for 7 > 0, C 7 maps the open (closed) right half-plane onto the open (closed) 
unit circle, and the open (closed) left half-plane onto the exterior of the open (closed) 
unit circle. 



2 Measures of criticality: gap and sep 
2.1 The gap between eigenvalues 

The simplest measure of criticality, adopted in most works on the shift [T3] is the 
so-called gap, i.e., the distance between the two eigenvalues closer to the imaginary 
axis gap(%) := |A n — A n+ i| . In the critical case gap(%) = 0, and a problem is called 
close-to-critical when gap(%) is small with respect to ||%||. A strictly related quantity, 
which appears explicitly in the expressions for the convergence speeds of SDA and Cyclic 
Reduction [1], is its Cayley-transformed version 

maxj = i n |C 7 (Aj)| 

gapc 7 (^) := — lr h 77, (5) 

where 7 is chosen according to 

7^7* = max < max ass, max dss > . (6) 

We have gap^('H) ^ 1, with equality in the critical case. Since the Cayley transform 
alters the position of the eigenvalues, it is not apparent that the minimum and maximum 
in ([5]) are attained in A ra and A n+ i; we give here a proof of this result. Let us first assess 
the following technical lemma. 

Lemma 1 Let Y be a closed disc in the complex plane with center C£R and radius r. 
The point in T with maximal modulus is one among C + r and C — r. 

Proof. Let C + pGC, \p\ ^ r, be a generic point in the disc. By the triangle inequality, 
[C + p| ^ |C| + \p\ ^ |C| + r, with equality if and only if \p\ = r and p has the same 
argument as C, i.e., either real positive or negative. □ 
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Theorem 2 Let Tl be associated with an M-NARE, and 7 be chosen according to ([6]). 
The minimum and the maximum in ([5]) are attained by i = n and j = 1, i.e., we may 
replace ([5]) with 



gaPCv(^) : 



|C 7 (A n ) 



|C 7 (A n+1 )| 

Proof. From ([6|) we have 7/ — D ^ and thus P = 7/ — D + CX* ^ 0. Hence we may 
write D — CX* = jl — P; from the Perron-Frobenius theory of M-matrices, it follows 
that all the eigenvalues of D — CX* are contained in the closed disc with center 7 and 
radius r = 7 — A n ; and in particular, the eigenvalue A n lies on its boundary. We call this 
disc T, and proceed to prove that C 7 (A n ) has the maximal modulus among all points in 
in C-y(r). The image of Y under the Cayley transform is a closed disc V , which must be 
contained in the unit disc and symmetric with respect to the real axis. This means that 
its center (which is not in general C 7 (7)) is real. This disc V intersects the real axis in 
the two points C 7 (A n ) and C 7 (27 — A n ). By the lemma, the point of maximal modulus 
in r' is one among them; direct computation (using A n ^ 7) shows that it is the former. 

A similar reasoning starting from A—X*C yields that minj=i ... :m |C 7 (A n +j)| is achieved 
by j = 1; we need some extra care with the signs, as A n+ i ^ 0, and with the fact that 
this time the image of the enclosing disc under the Cayley transform is the outside of a 
suitable disc. □ 

Therefore an explicit relation among the two concepts of gap can be established. 
Observe that while the gap changes by scaling the matrix T~i by a real parameter 
the Cay ley-transformed gap of aH is the same as the one of 7i, if the same value of 7 
is chosen according to Thus, the Cayley transformed gap can be seen as a relative 
inverse gap. 

2.2 The subspace separation 

A third, more accurate notion of nearness is given by the subspace separation I21j. 
We first define the separation between the two square matrices M and as 

l% , Ar , . \\MX-XN\\ 

sep(M, N) := mm- — — (7) 

x^o \\X\\ 

where ||-|| is a suitable matrix norm (for instance we denote by sep^ and sep 2 the 
separation in the Frobenius and spectral norm, respectively), and recall the bound 

sep(M, N) < min \fi-v\- (8) 

^ea{M),uecr(N) 

Given an invariant subspace W for a matrix A E <£(n+m)x(n+m) ^ ^ q ^ e an un itary 
matrix such that 



Q*AQ 



An A 12 
A 22 



A u eC nxn , A 22 GC mxm , (9) 
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and the first n columns of Q span W. We define sep(W) := sep(An, A22) and relsep(W) := 

scp(VV) 

\\a\\ • 

When A = T~i and W is the anti-stable space, relsep(W) gives a third measure of the 
distance of % from the critical case. The conditioning of the Riccati equation depends 
essentially on this separation measure, as shown by the following results. Let the distance 
between two subspaces be defined as dist(W, V) := \\Pu — PvW-, with Py\> the orthogonal 
projection on the image of W. 

Theorem 3 (|21j) Let W be an invariant subspace of a matrix % and W be an invari- 
ant subspace ofH = 1-L + E, where \\E\\ ^ e \\H\\. Then, for all sufficiently small e we 
have 

(10) 



dist(W,W) ^ C 
for a suitable constant C of moderate size. 



relsep W 



The result is stated in a stronger form in [21], using the norms of Eyi and Hyi in two 
suitable block partitions of the involved matrices, but here we favor this form for the 
sake of simplicity. 

A bound on the subspace distance can be transformed into a bound on the solutions 
of the associated Riccati equations. 



Lemma 4 Let W = span 



norm, we have 



X -X 



, W = span 



\InV + \\X\ 



In 

X 

jxV2 



Then, for both spectral and Frobenius 



\I II 2 + 

\ 1 n\\ 1 



X 



1/2 



dist(W,W). 



Proof. We use the characterization of dist given in [91 Theorem 2.6.1]; the proof there 
refers to the spectral norm, but it can be adapted to the Frobenius norm. We apply the 
result to the orthonormal matrices 



Wo 



-X* 
I 



(L + xx*y 1/2 , 



(L + x*xy 1/2 , 



then use norm submultiplicativity 



X -X 



(L + X*X) 1/2 



(l + x*xy 1/2 [-x i] 



(L + x*xy 1/2 



(L + X*X) 1/2 



Finally, the terms || (7 + X*X)- 1 / 2 \\ and (I + X*X)- 1 / 2 
desired form by taking an SVD of X and X, respectively. 



can be transformed into the 

□ 



A tighter bound, which still behaves essentially as 0((relsepW) 1 ), can be found in 

era. 



6 



2.3 Relations between gap and sep 

If the Frobenius norm is used, the gap(%) and the sep(W) are the smallest eigenvalue 
and the smallest singular value of the matrix (1 An — A 22 <8> I), respectively (compare 
([7]) and Q). It is well known that the two numbers coincide for normal matrices |21|, 
Exercise 5.2.1], but may differ significantly in the nonnormal case [214 Example 5.2.4]. 
The same happens for any norm, and in view of ([8]) we obtain that sep(W) ^ gap(%). 

For nonnormal matrices the subspace separation is a better tool to gauge the distance 
from criticality, especially when conditioning properties are in exam. However, as far as 
we know, all the literature regarding shift methods deals only with the gap as a measure 
of criticality. The geometrical intuition is clearer in the gap setting and the proofs are 
easier to carry on. On the other hand, the whole point of shifting strategies is getting 
rid of the two real eigenvalues A n and A n +i close to the origin, and it is less clear how 
we should define the gap in when the two eigenvalues have been moved. In view of ([8]), 
we extend the definition of gap in the following way. Let W be an invariant subspace of 
a matrix A, and An, A22 as in ([9]). We set 

gap(W) := min |A — fi\ . 

\eo-(A 1 i),fj,€o-(A 2 2) 

Similarly, we define the gap between two invariant subspaces of A as 
gap(ZY, V) := min [A — y\ . 

A associated to U, fi associated to V 

In the following, we define our subspace shift technique in terms of the gap metric, 
because only by resorting to eigenvalue location criteria we can select suitable subspaces 
for its application. When discussing conditioning, moving to the sep setting (and having 
good separation properties in this setting) is necessary. However, as far as we know, 
even a complete theory of the basic shift technique in terms of sep does not exist at 
present; the intuitive assertion that "things get better when we move the eigenvalues 
more far apart" is difficult to formalize in terms of the sep metric. We are not able to 
give full proof for many of the conditioning-related assertions, but we provide at least 
partial ones that show our claims when the separation bounds behave as suggested by 
the gap metric analogy. This in particular includes the case in which % is normal or 
departs only slightly from normality. 

3 Theoretical bases 
3.1 The shift technique 

The shift technique has been applied in |13| to the M-NARE ([1]) where Ai is a singular 
irreducible M-matrix, that is, when at least one between A n and A n +i is 0. Without loss 
of generality one can assume that A n = 0: the case A n > = A n+ i can be reduced to 
the case A n = by a simple trick [131 Lemma 5.1]. 
The shift technique is rooted in the following results. 
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Lemma 5 (Brauer's theorem |6j) Let (X,v) be an eigenpair for the matrix T. Let u 
be a vector with u*v = 1 and s be a scalar. The eigenvalues of the matrix T :=T + svu* 
are the same as those ofT, except for one occurrence of X which is replaced by A + s. 

Theorem 6 ([13]) LetU be the as in (J2J) associated with the M-NARE JU with X n = 0, 
and let v n be an eigenvector relative to X n ; consider the matrix 

H := H + sv n u*, 

with u*v n = 1 and s > 0. Then, the minimal solution X* of the M-NARE associated 
with Ti is a solution of the NARE associated with %. Moreover, m eigenvalues of H 
lie in the closed left half plane and n in the open right half plane, whose corresponding 

invariant subspace is spanned by the the columns of v . 

A, 

The shift technique consists in computing one eigenvector v n corresponding to the eigen- 
value A n = 0, and using it to construct the NARE associated with 7i, which we call the 
shifted NARE. The matrix "H has eigenvalues X±, . . . , A n _i, A n , A n +i, . . . , X n+m where 
X n = s (the eigenvalue X n has been "shifted" from to s, this justifies the name of 
the technique). Observe that gap(%) > gap(%); thus, better conditioning and faster 
convergence are expected, once the Cayley parameter 7 is fixed. It has been proved in 
[13j that SDA applied to the shifted equation, using the same Cayley parameter as the 
nonshifted case, but with the initial values as in ()14p . constructed from T~L, converges 
quadratically with a better rate of convergence than the nonshifted case. Numerical 
experiments [T3], |3] show that this technique reduces dramatically the number of steps 
of iterations like SDA and CR. In the critical case, the convergence from linear becomes 
quadratic. 

It can happen that the Riccati equation associated with % is not an M-NARE; that 
is, A4 = [q _Pj] T-L need not be an M-matrix. Hence, there is no guarantee that the SDA 
can be carried out without breakdown, even if in practice this method works well and 
the applicability of SDA is usually assumed |13j . 

Since A n = 0, the vector v n can be computed easily as ker Ai. In principle, the shift 
technique could be used also for nonsingular M-matrices, i.e., the hypothesis X n = is 
not actually needed in Theorem [6) In this case there is no simple relation among the 
eigenvectors of Ad and "H, and different techniques are needed for the computation of 
X n and v n ; for instance, the power method. However, in the close-to-critical case the 
eigenvector v n is ill-conditioned and therefore it cannot be computed with good accuracy. 

To solve this problem, we present in Section 13.31 a new technique which shifts a whole 
invariant subspace containing A n and A n+ i, without attempting to separate the eigen- 
vectors. To this purpose we use an invariant subspace whose associated eigenvalues are 
well separated from the rest of the spectrum. 

3.2 Results on separation 

We first provide a couple of simple lemmas that will be used in the following. 
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Lemma 7 Let the spectrum of M £ Qnxn ^ e ^ e un { on j f wo disjoint sets, say Ai 
and A2. Let the columns of U and V (with full column rank) span the left and right 
invariant subspaces of M corresponding to the eigenvalues in Ai, respectively. Then 
U*V is nonsingular. 

Moreover, let the columns ofW (with full column rank) span a right invariant subspace 
of M whose corresponding eigenvalues belong all to A2, then U*W = 0. 

Proof. Let L~ l ML = J be the Jordan canonical form of M where the Jordan blocks 
are ordered such that the k eigenvalues in Ai come first, then, a basis of the right 
invariant subspace of M corresponding to Ai is made by the first k columns of L, say 
L\. Thus V = L\P for some nonsing ular P € C kxk . Similarly, U* = QR\ where R\ 
are the first k rows of and Q € C fcxfe is nonsingular, thus U*V = QR\L\P = QP 
is nonsingular. By a similar argument, it can be proved that left and right invariant 
subspaces corresponding to different eigenvalues are orthogonal. □ 



Lemma 8 Let 

where a (An) n (7(^22) = 
1. There is a matrix 

with \\Z\\ ^ 



A 



An A u 
A 22 



I Z 
/ 

1 



such that Z e AZ e = di&g(An, A22 



sep(An,A 2 2) 

2. For each B, sep(A n ,B) ^ sep(A, B) and sep(A 2 2,B) ^ sep(A,B). 

3. sep(M, N) ^sep(TiMT^ 1 ,T2NT 2 1 )n(Ti)K(T 2 ). 

4. IfM = diag(Mi, M 2 ) and N = diag(Ni,N 2 ), then sep(M, N) = min ij= i i2 sep(Mi,Nj) 

Items [H [3] and S] are in [2T]. Item [5] follows from the definition of sep, by noting that the 
minimum of AX — XB increases if we restrict to matrices X, partitioned conformably 
with A as X = \X\ X2] , where one of the two blocks is zero. 

3.3 The subspace shift technique 

Let V and V be two complementary invariant subspaces of H satisfying the following 
conditions: 

1. V has small dimension k. 

2. V is well separated from V, i.e., gap(V, V) ^ 8%, for a 81 > not excessively small. 

3. V does not contain eigenvalues close to the imaginary axis, i.e., gap(V s ,V u ) ^ 82, 
for a 82 > not excessively small, where V s and V u are the invariant subspaces 
associated with the stable and anti-stable part of V. 
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Let V, U G c( n+m ) xfc be matrices whose orthonormal columns span respectively V and 
the left invariant subspace U associated with the same eigenvalues as V. Notice that 
U is well defined, since the subset of <j(H) associated to V and V are disjoint because 
gap(V,V) > 0. By LemmaEl U*V is nonsingular. 
We consider the matrix 

n = n{i + sV{U*V)- l U*)i (11) 

which is a rank k modification of %. Its spectral properties are summarized by the 
following result. 

Theorem 9 Let the spectrum of H G £,( n + m ) x ( n + m ) fr e the union of two disjoint sets, 
say Ai = ...,&} and A 2 = {&+i, . . . ,£ n+m }. Xei V, Z7 G C("+ m ) xfc 6e matrices 
whose orthonormal columns span the right and left invariant subspaces associated with 
the eigenvalues of h\, respectively. The matrix Ti in (|lip has the same right invariant 
subspaces as % and eigenvalues {(1 + s)£i, . . . , (1 + s)^, . . . , £ n +m}- 

Proof. As above, let V be the right invariant subspace of % corresponding to the 
eigenvalues in Ai and let V be the right invariant subspace complementary to V, corre- 
sponding to the remaining eigenvalues {£fe+i, . . • , Cm+n}- Let Wi C V be a right invariant 
subspace of Ti spanned by the column of the matrix W\ G c( n + m ) x ^i ; then W\ = VQ 
for some matrix Q G C fcx ^ 1 , thus 

UW X = U{Wi + sy^y)" 1 ?/*^) = H(Wi + sVQ) = + s), 

and thus {(1 + . . . , (1 + s)^} are eigenvalues of H. 

Let W2 C V be a right invariant subspace of T~L spanned by the column of the matrix 
W 2 6 C( n+m ) xfe , then by Lemma we have C/*W 2 = and hence UW 2 = UW 2 . Thus, 
. . . ,Cm+n} are eigenvalues of 7^. Since any right invariant subspace of H can be 
written as the sum of two invariant subspaces contained in V and V, respectively, the 
proof is completed. □ 

Theorem [9] can be applied to the linearizing matrix H of a close-to-critical M-NARE 
([1]), where V is chosen to be a subspace containing both A n and A n +i. Then the anti- 
stable invariant subspace W and thus the Riccati solution X* of the NARE associated 
to j-L are the same as the ones for ([T|). 

From the point of view of the eigenvalue location and of the gap metric, the behavior of 
the subspace shift technique is clear: the eigenvalues closer to the imaginary axis, which 
are responsible for the slow convergence and ill-conditioning, are multiplied by a factor 
1 + 3, which takes them farther from the imaginary axis and thus improve the gap. If the 
factor 1 + s is not too large then the Cayley gap is reduced and the numerical algorithms 
based on matrix iterations converge faster. In order to give a complete treatment of the 
stability properties of the technique, we have to resort to the separation metric. 
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3.4 Conditioning of U*V 



For the subspace shift technique, we need to form (U*V)~ 1 ; therefore, it is crucial that 
the condition of this matrix is not worse than the conditioning of the problem we are 
solving. Similarly to what happens in the original problem, we can relate its conditioning 
to the separation between V and V. 

Theorem 10 Let U and V be orthonormal bases for U and V as defined above. Then, 
we have 

||(?7*F) _1 || Crelsep(V)" 1 + D 



for moderate constants C,D > 0. 

Proof. Perform an orthonormal change of basis so that V 



and partition 



n 



'An A 12 
A 22 



Let Z as in item [T] of Theorem [SJ in particular we have \\Z\\ ^ relsep(V) . Notice 
that (/ + ZZ*)^ 1 / 2 [/ — Z] is another orthonormal basis of U, thus U* = Q(I + 
ZZ*)~ l l 2 [/ — Z\ for a suitable unitary Q. We have then 



\{U*V) 



-i 



(i + zz*) 1 /' 2 = (||/|| 2 + 1|^|| 2 ) 1/2 , 



where the last equality can be established as in the proof of Lemma HI 



□ 



3.5 Conditioning of the shifted problem 

While not a formal proof, the following argument can be used to gauge the conditioning 
of the shifted problem. Apply an orthogonal change of basis to bring H in the form 

~V a G_ a * *' 

v a * * 
V s G s ' 

v.. 

where o-(V s ) and a(V a ) contain respectively the stable and antistable eigenvalues associ- 
ated with V, and cr(V s ), cr(V a ) the stable and antistable eigenvalues associated with V. 
We may find a matrix R a such that 

R [ V a G a 
a [ V a 

and K(R a ) = o( — ^^j - . 2 ~\ . If g&p(V a , V a ) is a good approximation of the corresponding 
sep, or if V and V are well separated also in the sep sense, we expect this condition 



R-^dmgiVaX), 
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numbers to be moderate. We argue similarly for the blocks indexed by s, and construct 
a matrix R s which annihilates G s . We let R := diag(i? a , R s ). 

Using the special block structure of RHR^ 1 , we may construct the matrices U and V 
needed for the shift technique as 



V =R~ X 



I * 

* 

/ 





U =R 



I 











* 


I 


* 






with U*V = I. Therefore T~L takes the form 



H = R~ 



\s + l)V a * * 

v a * * 

(s + l)V s 

V, 



R, 



where the entries marked with an asterisk might be affine functions of s. Thanks to 
Lemma El the separation of the anti-stable subspace in H is 



sepW ^ 



min (sep((s + l)V a , (s + 1)V 8 ), sep((s + l)V a , V s ),sep(V a , (s + 1)V,), sep(K, V s )) 

K(R a )K(R s ) 

(12) 

If s is moderate and good separation properties hold among V and V, and between the 
stable and unstable part of V, then we may expect, in analogy with the gap setting, 
that this minimum is attained by the first element, i.e., sep W (s + 1) K ^ R * sepW. 
This means that, up to factors that depends on the separation of the central subspace 
sep V, the conditioning of the shifted problem is improved by a factor s + 1. 

Unfortunately, a full proof of the fact that the minimum in (|12|) is attained by the 
critical subspaces seems elusive. As far as we know, there are no results in literature 
concerning the behavior of the separation when a scaling is applied to one of the two 
matrices in a way that takes the eigenvalues "more far apart" (in some suitable setting) . 
The analogy with the gap setting and the experimental results in Section [5] seem to back 
up our claim. 



3.6 Stability under perturbations of the computed central subspace 

With a similar reasoning, we may try to estimate the impact of errors in the computed 
bases for the left and right central subspaces on the accuracy of the solution. We choose 
a basis as in Theorem [101 If we use perturbed versions of U* and V as 



V 



I + E 1 



U* 



[I + Fi -Z + F 2 ] 



we obtain instead of % 
H 



(s+l)#ii+s(£i#ii+#iiFi+£iffiiFi) H 12 -sH 11 Z+s(-E 1 H 11 Z+H 11 F2+E 1 H 11 F 2 ) 
sE 2 H 11 (I+Fl) H22+sE 2 H 11 (-Z+F 2 ) 
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If \\Ei\\ , < e, then 



n-n 



Kse \\Z\\ 11% II for a moderate constant K. From the 



reasoning in the previous section, we expect relsepW ^ k.(Ro)k(R s ) re i se pyy i j thus 
by (fTU|) the computed subspace W satisfies 

dist(w , w) s c^SiP = ^^W^lMf. 

relsep W 5 + 1 relse P W 

The factor ^fj is bounded by 1, and this bounds differs from the perturbation bound 
()10p for the nonshifted problem only by factors depending on relsep V. 



4 The SuShi (Subspace Shift) algorithm 

We assume that the space V corresponding to the smallest in modulus eigenvalues of 
T~L verifies the assumptions stated at the begin of Section 13.31 We call V the central 
subspace of %. 

In this case, the subspace shift technique of Section \3. 31 can be easily translated into a 
numerical algorithm for solving close-to-critical M-NARE (pTJ) . We call it SuShi (Subspace 
Shift) algorithm. 

Algorithm 1 SuShi algorithm for the solution of a close-to-critical M-NARE 
1: choose k 

2: compute two matrices U, V with orthogonal columns which span the left and right 
invariant subspaces of T~L corresponding to its k smallest in modulus eigenvalues, 
£i,...,£fc, respectively 

3: choose s > and compute H = U{I + sV(U*V)- l U*) 

4: solve the NARE 1Z(X) = associated with ~H, to get the minimal nonnegative 
solution X* of the original M-NARE TZ(X) = 



Algorithm 1 is very simple. Nevertheless, in order to get a decent implementation, the 
details need to be tackled with some care. One could ask how to dynamically determine 
the value of k, how to compute U and V, how to determine the value of s and how to 
efficiently solve the "shifted" NARE 1Z(X) = 0. This is the topic of the next sections. 



4.1 Computation of the central invariant subspaces 

The central left and right invariant subspaces are the ones corresponding to the smallest 
in modulus eigenvalues of 7i, then the inverse orthogonal iteration of [HI Section 9.3.2] 
applied to H and %*, respectively, converges to these subspaces. Notice that we assume 
that 7~L is nonsingular so that the customary shift technique cannot be used. 

The luckiest situation arises when k = 2 and the eigenvalues X n and A n+ i of % are 
the smallest in modulus and well separated from the others, as in the problem of [18) . 

In the general case, setting k = 2 may not yield the desired results since we could 
have close-to-critical settings such that X n and A n _i are very close to each other and 
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to zero, and A n +i slightly larger in modulus than both of them. Moreover, even if we 
shift away X n and A n+ i, the remaining eigenvalues of % (e.g., A ra _i in a setting similar 
to the case above) could be very close to them, and thus the convergence speed of the 
Riccati solvers based on matrix iterations is almost unchanged; therefore, the subspace 
shift with k = 2 can still be applied but is not much useful. 

For these reasons, we would like to handle cases where k may be larger than two, 
this means that there is some eigenvalue different from A n and A n +i near to the origin. 
Our minimal assumption is that there exists a value of k ^ 2 such that the k smallest 
eigenvalues of H are near to each other (close-to-critical case) and well separated from 
the (2n — k) largest eigenvalues (compare the assumptions of Section \3.3\i . 

If the value of k is not known in advance it can be determined using the following 
strategy: start the inverse orthogonal iteration with k = 2, estimate its convergence 
speed after some steps. If the iteration shows itself too slow, enlarge k until the conver- 
gence becomes (possibly) fast. This approach yields together V and k. The matrix U 
can be obtained applying a subspace iteration to %*. 



4.2 Magnitude of the shift parameter 

Another issue which appears in the practical implementation is the selection of the shift 
parameter s in Algorithm 1, which should be automatic to deal with different problems. 

A major concern is that if the chosen value of s is too small, then the two central 
eigenvalues do not move significantly and the gap remains small; on the other hand, if 



the shift parameter is excessively large then 



n 



grows, and the conditioning of the 

F 

shifted Riccati equation degrades, according to the results of [2j 

Let £i, . . . ,£ m + n be the eigenvalues of H ordered by nondecreasing modulus. When 
the main objective is the acceleration of matrix iterations, like the SDA, the value of s 
should be chosen such that the eigenvalues corresponding to the central subspaces, say 
£i, • • • ,£fc, which are responsible of the slow convergence, gets far from the imaginary 
axis. 

We need to estimate how small they are with respect to the other eigenvalues. We may 
get an estimate using the convergence speed of the inverse orthogonal iteration. Notice 
that, with our assumptions on the eigenvalues, the convergence rate is determined by 

, |&| 



IGh-iP 

hence a rough estimate for t is given by comparing successive iterates of the subspace 
iteration. The value (or £i), for small k, can be easily computed, since it is the 
largest in modulus eigenvalue of the k x k matrix V*HV, an alternative approach is to 
approximate it using power methods or some steps of the Arnoldi iteration. 

Once an approximation of t has been computed, if we choose s such that (1 + s)£i > 
then all the eigenvalues (1 + . . . (1 + s)^ become larger in modulus than 5. 
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4.3 Solution of the shifted equation 

A popular algorithm for computing the minimal solution of an M-NARE is the Structured 
Doubling Algorithm (SDA), which, in the formulation of [IB], is a system of rational 
matrix iterations defined as 

Ek+i =Ek{I — GkHk) 1 Ek, 
Fk+i =Fk(I - HkGk) 1 Fk, 
Gk+i =Gk + E k {I — GkHk^GkFk, 
Hk+i =Hk + F k (I - H k Gk) 1 H k E k , 

with suitable initial values E £ C nxn , F £ C mxm , G G C nxm , H G C mxn . We 
say that the SDA is applicable (for a set of initial values Eq, Fq, Gq, Hq) if the matrix 
/ — GkHk (or equivalently / — HkGk) is nonsingular for each k otherwise we say that 
the SDA has a breakdown. 

In the case of the M-NARE, choosing the initial values of the SDA as 

E =/ - 2jV~\ F =1 - 2^W-\ 

G =2 1 D~ 1 CW~\ H =2 7 W~ 1 BD-\ 

A y =A + 7/, D y =D + 7/, 

W 1 =A 1 - BD^C, V 1 =D y - CA^B, 



(14) 



for 7^7*, defined in (J6j) , yields well defined sequences such that Gk — > A* and H}~ — > 
where X* is the minimal nonnegative solution of the M-NARE, while is the minimal 
nonnegative solution of the dual M-NARE: YBY - Y A - DY + C = 0. In the critical 
cases the convergence is linear, while in the other cases is quadratic with rate 

^ = gapc 7 (^)= ,' C ;[ An) l - (15) 

\L-y{A n+ l)\ 

Moreover, the value of 7 ^ 7* that yields faster convergence is 7 = 7* (see fj] and the 
references therein). 

The SDA can be applied with minor modifications to the equation associated with the 
subspace shifted matrix 

r D -C 
B —A 



U 



It is enough to start the SDA with Eq, Fo, Go, Ho obtained using formulae (fTl"]) with the 
new coefficients A, B, C, D instead of A, B, C, D and with the same 7. 

The new matrix % might not be an M-matrix so that the applicability should be 
assumed, in this case one can prove that the method converges to the same limit matrices 
with a rate which is ^ 

max i=lj ... in |C 7 (Aj)| 



mm 



j=l,...,m \&y\\n+j)\ 



(16) 
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where Ai, . . . , \ n +m are the eigenvalues of H. The proof of convergence may be obtained 
with similar manipulations as the ones of |16} I13j. so we decided to omit it. 

Using the same parameter 7 for both standard SDA and SDA applied to the shifted 
equation gives different convergence rates according to (|15p and (|16p . An acceleration is 
obtained in the shifted case if v < v. On the assumption that the central eigenvalues are 
near to the origin, there exists s max such that v < v for any value of the shift parameter 

<C S <C Smax- 



5 Numerical experiments 

We present some numerical examples showing the effectiveness of the subspace shift 
technique, through the algorithms presented in Section in solving close-to-critical 
nonsingular M-NARE, when the assumptions of Section 13.31 are fulfilled; that is, when 
the k eigenvalues corresponding to the central subspaces are nonzero and well separated 
from the other eigenvalues of %. We recall that these assumptions can be identified 
dynamically by the algorithm. 

We compare the SDA applied to the original equation and to the shifted one. We 
report the number of steps required by the SDA to converge and the number of steps of 
the inverse orthogonal iteration in the subspace shift algorithm. All steps of the SDA 
and the first step of the inverse orthogonal iteration are the most expensive part of the 
algorithms, since their asymptotic cost is cubic with respect to the size of the matrices; 
for instance, for m = n, the cost of a step of the SDA is 0(n 3 ) elementary arithmetic 
operations. 

We estimate the accuracy of the computed solution X* by means of the relative error 

X* — X* 



where A*, if available, is the exact solution or an approximation of it obtained using a 
higher precision. Otherwise, we use the relative residual 





K{X*) 


F 


X^CX* + B 


+ 

F 


AX* + X*D 


F 



In our experiments the Frobenius norm is used. In the tests we use the value 10 as 
stopping tolerance, a lower tolerance has been used sometimes to monitor the error. 

Test 1 As a first test, we consider the close-to-critical cases of the transport problem 
treated in |14t [T8| [5]. It is an M-NARE with square coefficients of size n depending on 
two parameters ^ a < 1 and < c ^ 1 (for the complete definition and the meaning 
of the parameters see [18J). The problem is critical for [a, c) = (0, 1), and it is close-to- 
critical if a and c approach simultaneously and 1, respectively. In this problem k = 2 
and the two eigenvalues A n and A n+ i are well separated from the others. 
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gap(W) 


rsep(>V) 


gaPc 7 (^) 


rsep(W) 


gap(W) 


rsep(W) 


gaPc^(^) 


10~ 3 


0.11 


4.5 • 10~ 3 


0.98 


2.9 ■ 10" 2 


2.5 


2.3 ■ IO" 2 


0.69 




3.5 • 10~ 3 


1.4- 10" 4 


0.9995 


2.9 • 10~ 2 


2.5 


7.1 • 10~ 4 


0.69 


io- 12 


3.5 • 10~ 6 


1.4- 10~ 7 


« 1 


2.9 • 10~ 2 


2.5 


7.1 • 10~ 7 


0.69 



Table 1: Several separation measures for the transport problem with n = 4 



For n = 4 and certain values of f3 such that (a, c) = (/3, 1 — /3), we compute the absolute 
gap of H (gap("H) = |A n — A n+ i|), the relative sep of its stable subspace (rsep(W), 
defined in Section [272]) . the Cay ley-transformed gap (gap c (%), defined in and the 

same quantities for the shifted matrix 7i, where we have chosen 7 = 7* of (jSJ) in both 
cases and s = 6,3/^,1 — 1, where & are the eigenvalues of H ordered by nondecreasing 
modulus. We have computed moreover the relative sep of the central subspace, indicated 
by rsep(Z^). The results are reported in Table [TJ 

Since the conditioning of an invariant subspace is proportional to the reciprocal of the 
relative sep, we observe that the central invariant subspace is much better conditioned 
than the stable subspace and that the conditioning of the shifted problem is not worse 
than the one of the original problem. 

Recall that gap c (%) and gap c ^(%) are the parameters of quadratic convergence of 
the SDA in the nonshifted and the shifted case, respectively. If gap c ^(-) is near to 1, 
we expect a large number of steps of SDA to obtain the desired accuracy. This suggest 
that the SDA applied to the shifted equation converge much faster, as shown in the next 
tests. 

Test 2 We consider the transport problem of Test [lj to which the subspace shift algo- 
rithm is applied, where the Riccati equations are solved with the SDA. 

In Table we give the number of SDA iterations needed to get the best relative 
residual for different matrix sizes n and choices of the parameters f3 (in a stand-alone 
implementation the number of iterations may be slightly larger due to a non optimal 
stopping criterion). We provide in parentheses the number of orthogonal iterations 
needed to approximate the central invariant subspace in the shifted case, where the shift 
parameter is chosen with the approximation strategy of Section 14.21 

As f3 approaches zero, the problem becomes close-to-critical; in fact (3 is strictly related 
to the relative gap. The table reports also gap("H) and the minimum distance 5 from 
the two eigenvalues A ra and A n+ i to the other eigenvalues of H. 

As one can see the problem is well suited to be solved by our algorithms since the 
central eigenvalues are well separated from the others and 5 is always large enough while 
the gap goes to zero; this shows that this example fits adequately the assumptions of 
Section 13.31 

Test 3 The second example is taken from [TU]. The matrix A4 is a random M-matrix 
of size n and depending on a parameter a. As a tends to 0, the matrix A4 tends to 
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n 


P 


gap(H) 


5 


SDA its 


SDA res 


Alg 1 its 


Alg 


1 res 


32 


io~ 3 


0.11 


0.96 


15 


8.8 • 10~ 15 


11 (12) 


4.2 


10 -IB 


32 


10~ 6 


3.5 • 10~ 3 


1.0 


20 


1.0 • 10~ 14 


11 (6) 


1.1 


1Q -16 


32 


io- 12 


3.5 • 10~ 6 


1.0 


27 


8.1 • 10~ 15 


11 (3) 


1.1 


io- 16 


128 


10~ 3 


0.11 


0.95 


17 


1.2 • 10~ 13 


13 (12) 


7.7 


io- 15 


128 


10~ 6 


3.5 • 10~ 3 


1.0 


21 


8.0 • 10~ 13 


13 (6) 


3.6 


1Q -16 


128 


io- 12 


3.5 • 10~ 6 


1.0 


30 


1.5 • IO" 13 


12 (4) 


2.7 


io- 16 



Table 2: Number of iterations for Algorithm 1 vs. SDA on the transport problem 



n 


a 


gap (ft) 


5 


SDA its 


SDA res 


Alg 1 its 


Alg 


1 res 


50 


10~ 3 


0.43 


43 


12 


3.9 • 10~ 16 


4(7) 


1.1 


10 -16 


100 


10~ 3 


0.61 


90 


13 


6.1 • 10~ 16 


4(7) 


1.9 


io- 16 


32 


io- 12 


1.1 


185 


13 


8.6 • 10~ 16 


4(7) 


3.3 


1Q -16 



Table 3: Number of iterations for Algorithm 1 vs. SDA on the problem of Test [3] 



a singular matrix. The problem is not close-to-critical, however, there are two central 
eigenvalues well separated from the others, so the subspace shift algorithm works fine. 

In Table [3] we report the results for this problem. The effectiveness of the subspace 
shift algorithm suggests the possibility to use it in particular problems where a fistful of 
small eigenvalues are well separated from the others. 

Test 4 The residual is not always a good measure of the accuracy of the solution of a 
matrix equation. To test the accuracy of the subspace shift algorithm we consider the 
problems of Test [1] and Test [3] and compute the solution with double precision to get a 
solution X* exact up to 8 significant digits. Then we run the customary SDA and the 
SuShi algorithm with precision 10~ 8 (single precision). 

The relative error is essentially the same in both cases. For instance, for Test Q] with 
n = 4 and a = 10 -3 we get for both errors 1.8 • 10 -7 , for Test with n = 100 and 
a = 10 -3 we get for both errors 1.4 • 10 -7 . 

6 Conclusions 

We have provided a generalization of the shift technique which is aimed to handle close- 
to-critical nonsymmetric algebraic Riccati equations. The technique consists in comput- 
ing explicitly the (hopefully moderate-sized) invariant subspace relative to the smallest 
eigenvalues, which are responsible for the slow convergence of the solution algorithms, 
and modifying the problem in order to remove them. A theoretical analysis is outlined, 
not only in terms of eigenvalue location, but also using the more powerful separation 
metric, which is the one related to the conditioning of the problem; numerical experi- 
ments are presented and prove that the application of the shift technique is effective on 
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the analyzed problems. 
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